Time-Space Trade-Offs in Scaling up RDF Schema Reasoning

Authors

  • Heiner Stuckenschmidt
  • Jeen Broekstra
Abstract

A common way of reducing the run-time complexity of RDF Schema reasoning is to compute (parts of) the deductive closure of a model offline. This reduces the complexity at run time but increases the space requirements and the model maintenance effort, because derivable facts have to be stored explicitly and checked for validity when the model is updated. In this paper we experimentally identify certain kinds of statements as the major sources of the increase. Based on this observation, we develop a new approach to RDF reasoning that computes only a small part of the implied statements offline, thereby reducing space requirements, upload time and maintenance overhead. The computed fragment is chosen in such a way that the problem of inferring implied statements at run time can be reduced to a simple form of query rewriting. This new method has two benefits: it reduces the amount of storage space needed, and it allows online reasoning to be performed without a dedicated inference engine.
A common way of reducing the run-time complexity of RDF Schema reasoning is to compute (parts of) the deductive closure of a model offline, using the deduction rules specified in the RDF Semantics Specification [6], and to work on the expanded model at query time. Most implementations of RDF reasoning use existing approaches such as the RETE algorithm [3] that were originally invented for deductive databases and rule-based expert systems. Obviously, there is a trade-off between run-time complexity and the amount of space needed to store the deductive closure. In the first part of this paper we analyze the space requirements of computing the deductive closure using a number of large real-life RDF models and compare them to the minimal space needed for storing the model.
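
The trade-off described above can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' algorithm: the `ex:` names and data are invented, only subclass transitivity and type inheritance are considered, and the rewriting shown (expanding a type query into a union over subclasses) is just one plausible reading of "query rewriting". The function names `materialize`, `subclasses_of` and `instances_by_rewriting` are hypothetical.

```python
# Illustrative sketch only: invented data, two inference patterns, and a
# hypothetical form of query rewriting (not taken from the paper).
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

model = {
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:Person", SUBCLASS, "ex:Agent"),
    ("ex:alice", TYPE, "ex:Student"),
    ("ex:bob", TYPE, "ex:Person"),
}


def materialize(triples):
    """Offline strategy: add implied subclass and type statements until fixpoint."""
    closure = set(triples)
    while True:
        new = set()
        for (a, p, b) in closure:
            if p != SUBCLASS:
                continue
            for (s, q, o) in closure:
                if q == SUBCLASS and s == b:
                    new.add((a, SUBCLASS, o))  # subclass transitivity
                if q == TYPE and o == a:
                    new.add((s, TYPE, b))  # type inheritance
        if new <= closure:
            return closure
        closure |= new


def instances_by_lookup(triples, cls):
    """Answer a type query by a plain lookup over the stored statements."""
    return {s for (s, p, o) in triples if p == TYPE and o == cls}


def subclasses_of(triples, cls):
    """cls plus every class reachable by following rdfs:subClassOf backwards."""
    found, frontier = {cls}, [cls]
    while frontier:
        current = frontier.pop()
        for (s, p, o) in triples:
            if p == SUBCLASS and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def instances_by_rewriting(triples, cls):
    """Online strategy: rewrite one type query into a union over subclasses."""
    targets = subclasses_of(triples, cls)
    return {s for (s, p, o) in triples if p == TYPE and o in targets}


closure = materialize(model)
stored_offline = len(closure)   # statements kept when materializing
stored_online = len(model)      # statements kept when rewriting queries
answers_offline = instances_by_lookup(closure, "ex:Agent")
answers_online = instances_by_rewriting(model, "ex:Agent")
```

On this toy model both strategies return the same instances of `ex:Agent`, but the materialized store holds 8 statements against 4 in the original model; the rewriting strategy pays for the saved space with extra work at query time.
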
We argue that existing algorithms for offline closure computation work quite well mainly because the scenarios in which they have been applied are still far from the vision of Semantic Web reasoning: they work on a relatively small amount of centrally stored data. In a recent study, Guo et al. revealed the limitations of current systems with respect to handling large amounts of data, both in terms of upload time and query time [4]. Another serious problem of closure computation is the need to recheck the validity of derived statements when the model is changed. This revision process is known to be very expensive in theory and in practice [2]. In this paper, we propose a novel strategy for RDF reasoning that combines offline computation based on an extensional semantics for RDF Schema with a simple form of online reasoning. We evaluate our method with respect to space requirements and run-time behavior.
The paper is organized as follows. In section 1 we briefly recall the foundations of RDF reasoning based on [5], focusing on a proof system for RDF Schema as well as the notions of closure and reduction. In section 1.1 we compare the space complexity of closure and reduction for some real-life models and discuss the closure computation approach. In section 2 we introduce a new reasoning strategy for RDF Schema and show its completeness and correctness. In section 3 we report the results of experiments in which we apply the approach to the data used in section 1.1. We close with a discussion of general space-time trade-offs and the specific characteristics of the proposed method.
1 Analysis of Space Requirements
RDF models can be seen as a set of statements or as the graph induced by these statements. RDF Schema models are RDF models in which a subset of the triples use a designated vocabulary with a special meaning defined in the RDF Semantics Specification. This special meaning allows us to derive new statements.
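
The two views of an RDF model mentioned above can be sketched as follows. This is a minimal illustration with invented `ex:` statements. The helper names `induced_graph` and `schema_statements` are hypothetical, and the schema vocabulary is simplified to a handful of terms checked in the predicate position only.

```python
from collections import defaultdict

# Hypothetical example statements; the ex: names are invented for illustration.
statements = {
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:worksFor", "rdfs:domain", "ex:Employee"),
    ("ex:Employee", "rdfs:subClassOf", "ex:Person"),
}

# A few terms of the RDF Schema vocabulary (a simplification, not the full list).
SCHEMA_VOCABULARY = {
    "rdf:type",
    "rdfs:subClassOf",
    "rdfs:subPropertyOf",
    "rdfs:domain",
    "rdfs:range",
}


def induced_graph(triples):
    """View the statement set as a labelled directed graph: node -> [(label, node)]."""
    graph = defaultdict(list)
    for s, p, o in sorted(triples):
        graph[s].append((p, o))
    return dict(graph)


def schema_statements(triples):
    """Statements whose predicate belongs to the (simplified) schema vocabulary."""
    return {t for t in triples if t[1] in SCHEMA_VOCABULARY}


graph = induced_graph(statements)
schema_part = schema_statements(statements)
```

Here `graph` maps each subject to its outgoing labelled edges, and `schema_part` contains the two statements that carry the special meaning from which new statements can be derived.
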
In the following, we briefly describe a proof system for RDF Schema that has been proposed in [5]. The RDF Semantics Specification [6] not only provides a model-theoretic semantics for RDF and RDF Schema, but also provides an alternative specification of the semantics in terms of a deduction system. The deduction system consists of a set of axioms about the nature of RDF and RDF Schema statements and a set of inference rules that can be used to derive new statements from existing ones. We list these deduction rules below, because we refer extensively to individual rules throughout the paper. For the set of axioms, we refer to the specification.
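
To illustrate how inference rules of this kind derive new statements, the sketch below applies three rules naively until no new statement appears. It is not the paper's rule list: the rule names `rdfs2`, `rdfs3` and `rdfs7` follow the numbering of the RDF Semantics Specification as far as we can tell, side conditions such as the treatment of literals are ignored, and the data and the `closure` helper are invented for illustration.

```python
# Minimal sketch of naive forward chaining over three RDFS entailment rules.
# Assumption: simplified rules without literal checks; the data is invented.

def rdfs2(triples):
    """(p rdfs:domain c), (s p o)  =>  (s rdf:type c)"""
    domains = [(p, c) for (p, q, c) in triples if q == "rdfs:domain"]
    return {(s, "rdf:type", c) for (p, c) in domains for (s, q, o) in triples if q == p}


def rdfs3(triples):
    """(p rdfs:range c), (s p o)  =>  (o rdf:type c)"""
    ranges = [(p, c) for (p, q, c) in triples if q == "rdfs:range"]
    return {(o, "rdf:type", c) for (p, c) in ranges for (s, q, o) in triples if q == p}


def rdfs7(triples):
    """(p rdfs:subPropertyOf q), (s p o)  =>  (s q o)"""
    subprops = [(p, q) for (p, r, q) in triples if r == "rdfs:subPropertyOf"]
    return {(s, q, o) for (p, q) in subprops for (s, r, o) in triples if r == p}


RULES = [rdfs2, rdfs3, rdfs7]


def closure(triples):
    """Apply all rules repeatedly until a fixpoint is reached."""
    result = set(triples)
    while True:
        derived = set().union(*(rule(result) for rule in RULES)) - result
        if not derived:
            return result
        result |= derived


model = {
    ("ex:teaches", "rdfs:subPropertyOf", "ex:worksWith"),
    ("ex:worksWith", "rdfs:domain", "ex:Person"),
    ("ex:teaches", "rdfs:range", "ex:Course"),
    ("ex:alice", "ex:teaches", "ex:logic101"),
}

closed = closure(model)
derived_statements = closed - model
```

The statement `("ex:alice", "rdf:type", "ex:Person")` only becomes derivable in the second pass, after `rdfs7` has produced `("ex:alice", "ex:worksWith", "ex:logic101")`, which is why the rules are iterated to a fixpoint rather than applied once.
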

Publication date: 2005